Defect prediction with bad smells in code
Background: Defect prediction in software can be highly beneficial for
development projects when prediction is effective and defect-prone
areas are predicted correctly. One of the key elements of effective
software defect prediction is the proper selection of metrics used for dataset
preparation. Objective: The purpose of this research is to verify whether code
smell metrics, collected using the Microsoft CodeAnalysis tool and added to a
basic metric set, can improve defect prediction in an industrial software
development project. Results: We verified whether extending the dataset with
code smell metrics changes the effectiveness of defect prediction by comparing
prediction results for datasets with and without code smell metrics.
We observed only a small improvement in the effectiveness of defect
prediction when the dataset extended with bad smell metrics was used: average
accuracy increased by 0.0091 and stayed within the margin of error.
However, when only code smell metrics were used for prediction
(without the basic set of metrics), the process yielded surprisingly high
accuracy (0.8249) and F-measure (0.8286). We also describe the data
anomalies and problems we observed when two different metric sources were used
to prepare one consistent set of data. Conclusion: Extending the dataset with
code smell metrics does not significantly improve prediction
effectiveness. The achieved result did not compensate for the effort needed to
collect the additional metrics. However, we observed that defect prediction
based on code smells alone is still highly effective and can be used especially
where other metrics can hardly be used.
Comment: Chapter 10 in Software Engineering: Improving Practice through
Research (B. Hnatkowska and M. Śmiałek, eds.), pp. 163-176, 201
Bottlenecks in Software Defect Prediction Implementation in Industrial Projects
Case studies focused on software defect prediction in real, industrial software development projects are extremely rare. We report on a dedicated R&D project established in cooperation between Wroclaw University of Technology and one of the leading automotive software development companies to investigate the possibility of introducing software defect prediction using DePress (Defect Prediction in Software Systems), an open source, extensible software measurement and defect prediction framework in which the authors are involved. In the first stage of the R&D project, we verified what kinds of problems can be encountered. This work summarizes the results of that phase.
Predictive power of two data flow metrics in software defect prediction
Data flow coverage criteria are widely used in software testing, but there is almost no research on low-level data flow metrics as software defect predictors. Aims: We examine two such metrics in this context: dep-degree (DD), proposed by Beyer and Fararooy, and a new data flow metric called dep-degree density (DDD). Method: We investigate the importance of DD and DDD in software defect prediction (SDP) models. We perform a correlation analysis to check whether DD and DDD measure different aspects of the code than the well-known size, complexity, and documentation metrics. Finally, we perform experiments with five different classifiers on nine projects from the Unified Bug Dataset to compare the performance of SDP models trained with and without data flow metrics. Results: 1) DD is noticeably correlated with many other code metrics, but DDD is uncorrelated or only very weakly correlated with the other metrics considered in this study; 2) both DD and DDD are highly ranked in the feature importance analysis; 3) SDP models that use DD and DDD perform better than models that do not use data flow metrics. Conclusions: The data flow metrics DD and DDD can be valuable predictors in SDP models.